How does removing the low samples (Day 8 and CFUs around 0) affect prediction?
Removing these data points improves predictability for days 6 and 8 but not for the earlier days. If I raise the cutoff to remove all samples with CFU below 5, day 3 CFU classification improves slightly, at the cost of worse day 1 CFU classification.
Otu00004 Otu00005 Otu00015 Otu00019 Otu00030 Otu00199 Otu00200 Otu00250
      14       17       13       15       13       16       17       19
Boruta confirmed the following OTUs as important for predicting day 9/10 CFU:
Otu00004 Otu00005 Otu00015 Otu00030 Otu00081 Otu00199 Otu00200 Otu00250 Otu00297
      21       22       21       21       21       22       22       22       20
I also selected OTUs by collecting the features from the most predictive community/CFU models (R^2 >= 0.6 and MSE <= 0.8). All % Increase in MSE values were converted to relative values, the median value was taken for each OTU, and the OTUs that fall above the median value were kept. This results in the following OTUs:
“Otu00001” “Otu00002” “Otu00004” “Otu00005” “Otu00012” “Otu00014” “Otu00015” “Otu00016” “Otu00019” “Otu00028” “Otu00030” “Otu00048” “Otu00081” “Otu00101” “Otu00118” “Otu00199” “Otu00200” “Otu00250” “Otu00297”
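The selection procedure described above can be sketched roughly as follows. This is a minimal illustration and not the code used for the analysis. It makes several assumptions the notes do not spell out: "relative values" is taken to mean scaling each model's % Increase in MSE by that model's largest value, and "above the median value" is taken to mean above the median of the per-OTU medians. The function name `select_otus` and the input layout (a list of dicts with `r2`, `mse`, and `importance` keys) are hypothetical.

```python
from statistics import median


def select_otus(models, min_r2=0.6, max_mse=0.8):
    """Return OTUs whose median relative importance exceeds the overall median.

    models: list of dicts with keys
        'r2'         -> model R^2
        'mse'        -> model MSE
        'importance' -> dict mapping OTU name to % Increase in MSE
    (This input layout is hypothetical.)
    """
    # Keep only the most predictive models.
    kept = [m for m in models if m["r2"] >= min_r2 and m["mse"] <= max_mse]

    # Assumption: "relative" means scaled by the largest value within a model.
    per_otu = {}
    for m in kept:
        top = max(m["importance"].values(), default=0.0)
        if top <= 0:
            continue  # nothing informative to scale by
        for otu, inc_mse in m["importance"].items():
            per_otu.setdefault(otu, []).append(inc_mse / top)

    if not per_otu:
        return []

    # Median relative importance of each OTU across the kept models.
    otu_medians = {otu: median(vals) for otu, vals in per_otu.items()}

    # Assumption: the cutoff is the median of those per-OTU medians.
    cutoff = median(otu_medians.values())
    return sorted(otu for otu, val in otu_medians.items() if val > cutoff)
```

With a strict `>` comparison, an OTU sitting exactly at the cutoff is dropped, so at most about half of the candidate OTUs survive. A different normalization (for example, dividing by the per-model sum) could change which OTUs end up above the cutoff.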